Incorporating Categorical Variables

In R

Categorical variables can be stored as either character or factor data types


factor – structured & fixed number of levels / options

  • can be ordered or unordered


character – unstructured & variable number of levels

  • is inherently unordered
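
A minimal sketch of the difference between the two types. The values and the names `section_chr`, `section_fct`, and `size_ord` are made up for illustration and do not come from the salamander data.

```r
# Hypothetical values, for illustration only
section_chr <- c("clear cut", "old growth", "clear cut")

# A character vector has no stored set of levels
levels(section_chr)            # NULL

# Converting to a factor fixes the set of allowed levels
section_fct <- factor(section_chr, levels = c("old growth", "clear cut"))
levels(section_fct)            # "old growth" "clear cut"
is.ordered(section_fct)        # FALSE

# An ordered factor also records a ranking of its levels
size_ord <- factor(c("small", "large", "medium"),
                   levels = c("small", "medium", "large"),
                   ordered = TRUE)
size_ord[1] < size_ord[2]      # TRUE: "small" ranks below "large"
```

In a plot, the order of a factor's levels is the order in which categories appear on an axis, in a legend, or across facets, whereas a character variable is typically shown in alphabetical order.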

Incorporating Categorical Variables into Data Visualizations

  • As a variable on the x- or y-axis

  • As a color / fill

  • As a facet

Salamander Size

ggplot(data = salamander, 
       mapping = aes(x = length_2_mm)) + 
  geom_histogram(binwidth = 14) + 
  labs(x = "Snout to Tail Length (mm)")


How would this histogram look if there were no variation in salamander length?


What are possible causes for the variation in salamander length?

Faceted Histograms

ggplot(data = salamander, 
       mapping = aes(x = length_1_mm)) + 
  geom_histogram(binwidth = 14) + 
  facet_wrap(~ section, scales = "free") +
  labs(x = "Snout to Tail Length (mm)")

What do you think scales = "free" does?
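
One way to explore this question is to build the same faceted histogram with and without the argument and compare the axes. The sketch below assumes a small made-up data frame, `toy`, standing in for the salamander data; its values and section codes are invented.

```r
library(ggplot2)

# Hypothetical toy data standing in for the salamander measurements
toy <- data.frame(
  length_1_mm = c(30, 35, 40, 45, 90, 100, 110, 120),
  section     = rep(c("CC", "OG"), each = 4)
)

base_plot <- ggplot(data = toy, mapping = aes(x = length_1_mm)) +
  geom_histogram(binwidth = 14)

# Same plot with the default facet scales
p_fixed <- base_plot + facet_wrap(~ section)

# Same plot with scales = "free"
p_free <- base_plot + facet_wrap(~ section, scales = "free")
```

Printing `p_fixed` and `p_free` one after the other and reading the axis ranges on each panel should make the effect of the argument visible.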

Side-by-Side Boxplots

ggplot(data = salamander, 
       mapping = aes(x = length_1_mm, 
                     y = species)
         ) + 
  geom_boxplot() + 
  labs(x = "Snout to Tail Length (mm)", 
       y = "Salamander Species") 

ggplot(data = salamander, 
       mapping = aes(y = length_1_mm, 
                     x = species)
         ) + 
  geom_boxplot() + 
  labs(y = "Snout to Tail Length (mm)", 
       x = "Salamander Species")

Which orientation do you prefer? Vertical or horizontal?

Colors in Boxplots

ggplot(data = salamander, 
       mapping = aes(x = length_1_mm, 
                     y = species, 
                     color = unittype)
         ) + 
  geom_boxplot() + 
  labs(x = "Snout to Tail Length (mm)", 
       y = "Salamander Species", 
       color = "Channel Type")

Why are there only two boxplots for the Olympic torrent salamander?
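
One way to investigate a question like this is to count the observations in each combination of the two categorical variables, since a combination with no rows cannot produce a boxplot. The sketch below uses a small made-up data frame, `toy`, with invented species and channel-type codes rather than the real salamander data.

```r
# Hypothetical data: species "B" was never observed in channel type "P"
toy <- data.frame(
  species  = c("A", "A", "A", "B", "B"),
  unittype = c("C", "P", "SC", "C", "SC")
)

# Cross-tabulate the two categorical variables
counts <- table(toy$species, toy$unittype)
counts

# Number of channel types with at least one observation, per species
rowSums(counts > 0)
```

Running the same cross-tabulation on `salamander$species` and `salamander$unittype` would show which combinations are actually present in the data.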

Facets & Colors in Boxplots

ggplot(data = salamander, 
       mapping = aes(x = length_1_mm, 
                     y = species, 
                     color = section)
         ) + 
  geom_boxplot() + 
  facet_wrap(~ unittype) + 
  labs(x = "Snout to Tail Length (mm)", 
       y = "Salamander Species", 
       color = "Section in Mack Creek")

Facets & Color in Scatterplots

ggplot(data = salamander, 
       mapping = aes(x = length_1_mm, 
                     y = weight_g, 
                     color = section)
         ) + 
  geom_point() + 
  facet_wrap(~ species, scales = "free") +
  labs(x = "Snout to Tail Length (mm)", 
       y = "Weight (g)", 
       color = "Section in Mack Creek")

Your Turn

What are the aesthetics included in this plot?